feature: deferred loading and requirement pruning #1199

leondz · 2025-05-07T11:00:18Z

Support construction-time loading of optional modules. Includes

many generators now using this pattern
pattern for loading modules at run-time and failing if absent
optional requirements moved to pyproject.toml options and pruned from requirements.txt
_load_deps and _clear_deps pattern, used in generator constructor and _load_client / _clear_client
tests to check that optional deps are in the right place and not in the wrong place
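As a rough illustration of the "load at run-time and fail if absent" item above, here is a minimal sketch. The function name `load_optional_module_sketch` is hypothetical and is not garak's `load_optional_module`, whose body is not shown in this conversation; the only detail taken from the PR is that a missing module surfaces as a `ModuleNotFoundError`.

```python
import importlib


def load_optional_module_sketch(module_name: str):
    """Import an optional module on demand (hypothetical helper).

    Raises ModuleNotFoundError with a readable message if the module
    is not installed, instead of failing at plugin import time.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise ModuleNotFoundError(
            f"Optional module '{module_name}' is needed but not installed"
        ) from exc


# 'json' is a stdlib stand-in for a real optional package
json_mod = load_optional_module_sketch("json")
```

The import cost and the failure are both deferred until a plugin that needs the module is actually constructed.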

Todo / in scope:

Not done:

load/clear deps for plugin types other than generators
gh actions for testing optional components
deeper validation

Out of scope:

handling of versioning outside of pyproject.toml

Resolves #101


jmartin-tech · 2025-05-12T15:06:23Z

pyproject.toml

+]
+plugin_replicate = [
+  "replicate>=0.8.3",
+]


Consider adding a group that allows pip install garak[plugin_all]:

plugin_all = garak[ plugin_cohere, plugin_langchain, plugin_llava, plugin_litellm, plugin_mistralai, plugin_nemoguardrails, plugin_nemollm, plugin_octoai, plugin_ollama, plugin_replicate, ]

There might be a better name to use for this group.


jmartin-tech · 2025-05-16T21:17:53Z

requirements.txt

Should we keep requirements.txt as a consolidated list of all possible requirements and require usage of pyproject.toml to get a limited-dependency install?

Another option would be to simply remove requirements.txt and require that all installs be based on pyproject.toml until we implement a lock file.

Right. After migrating the gh workflows to use pyproject.toml, requirements.txt becomes superfluous - I'm leaning toward the latter.

Removal does mean we lose a cross-check in test_reqs, but that's OK. We could do with a test to ensure all packages (including optional ones) specify a version.
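A test of the kind suggested here could look roughly like the sketch below. It is an assumption-laden illustration: the helper name `unversioned` is hypothetical, the regex only looks for a PEP 440 comparison operator, and direct URL references would be reported as unversioned. A real test would read the dependency lists from pyproject.toml rather than from a literal list.

```python
import re

# A PEP 440 comparison operator followed by some version text
_VERSION_OP = re.compile(r"(===|==|~=|!=|<=|>=|<|>)\s*\S")


def unversioned(requirements):
    """Return the requirement strings that carry no version specifier."""
    missing = []
    for req in requirements:
        spec = req.split(";", 1)[0]  # drop environment markers
        if not _VERSION_OP.search(spec):
            missing.append(req)
    return missing


# "replicate>=0.8.3" is the entry quoted earlier in this review;
# the other entries are made up for illustration
sample = ["replicate>=0.8.3", "ollama", "foo; python_version<'3.12'"]
problems = unversioned(sample)
```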

erickgalinkin

Looks good to me overall! Just a handful of comments throughout.

erickgalinkin · 2025-05-28T14:29:21Z

garak/_plugins.py

+    # check cache for optional imports
+    if category in PLUGIN_TYPES:
+        extra_dependency_names = PluginCache.instance()[category][full_plugin_name][
+            "extra_dependency_names"
+        ]
+        if len(extra_dependency_names) > 0:
+            for dependency_module_name in extra_dependency_names:
+                for dependency_path in [ # support both plain names and also multi-point names e.g. langchain.llms
+                    ".".join(dependency_module_name.split(".")[: n + 1])
+                    for n in range(dependency_module_name.count(".") + 1)
+                ]:
+                    if importlib.util.find_spec(dependency_path) is None:
+                        _import_failed(dependency_path, full_plugin_name)


Is this really the best way to do this? Perhaps we just enforce lazy loading throughout instead? I'm not sure.

Oh, I guess that is what we're doing. This is the hazard of doing code reviews linearly, I suppose.
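For readers following the hunk above, this is a standalone sketch of the prefix expansion it performs. The function names are hypothetical. One point worth knowing: `importlib.util.find_spec` on a dotted name imports the parent package, and raises `ModuleNotFoundError` if the parent is missing, so the sketch checks the shortest prefix first and treats that exception as "not found".

```python
import importlib.util


def dotted_prefixes(module_name):
    """'a.b.c' -> ['a', 'a.b', 'a.b.c']"""
    parts = module_name.split(".")
    return [".".join(parts[: n + 1]) for n in range(len(parts))]


def first_missing(module_name):
    """Return the shortest prefix that cannot be found, or None if all resolve."""
    for path in dotted_prefixes(module_name):
        try:
            spec = importlib.util.find_spec(path)
        except ModuleNotFoundError:
            spec = None  # parent package itself is absent
        if spec is None:
            return path
    return None
```

Unlike a plain import, this reports which level of a multi-part name is absent without loading the leaf module.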

erickgalinkin · 2025-05-28T14:34:02Z

garak/generators/base.py

+    def _load_deps(self):
+        # load external dependencies. should be invoked at construction and
+        # in _client_load (if used)
+        for extra_dependency in self.extra_dependency_names:
+            extra_dep_name = extra_dependency.replace(".", "_").replace("-", "_")
+            if (
+                not hasattr(self, extra_dep_name)
+                or getattr(self, extra_dep_name) is None
+            ):
+                setattr(
+                    self,
+                    extra_dep_name,
+                    garak._plugins.load_optional_module(extra_dependency),
+                )
+
+    def _clear_deps(self):
+        # unload external dependencies from class. should be invoked before
+        # serialisation, esp. in _clear_client (if used)
+        for extra_dependency in self.extra_dependency_names:
+            extra_dep_name = extra_dependency.replace(".", "_")
+            setattr(self, extra_dep_name, None)


Should this be in Configurable instead, since it can/should be used across all base classes?

erickgalinkin · 2025-05-28T14:39:08Z

garak/generators/huggingface.py

@@ -158,19 +159,15 @@ class OptimumPipeline(Pipeline, HFCompatible):
    generator_family_name = "NVIDIA Optimum Hugging Face 🤗 pipeline"
    supports_multiple_generations = True
    doc_uri = "https://huggingface.co/blog/optimum-nvidia"
+    extra_dependency_names = ["optimum-nvidia"]


Minor note that has little to do with this PR: it does drive me a bit nuts that the dependency name and the import statement so often do not match.

erickgalinkin · 2025-05-28T14:41:04Z

garak/generators/huggingface.py

    def _load_client(self):
+        self._load_deps()
        if hasattr(self, "generator") and self.generator is not None:
            return


In the interest of DRYness, I notice this exact code repeated across a number of the HFCompatible classes; perhaps we should refactor into HFCompatible?
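One possible shape for that refactor, as a sketch only: the shared base owns the "already loaded?" guard and each subclass supplies just the construction step. `ClientGuardMixin` and `_build_generator` are hypothetical names, and `_load_deps` here is an empty stand-in for the PR's loader.

```python
class ClientGuardMixin:
    """Hypothetical base that performs the 'already loaded?' check once."""

    generator = None

    def _load_deps(self):
        # stand-in for the dependency loader added in this PR
        pass

    def _build_generator(self):
        # hypothetical hook: each subclass returns its own client object
        raise NotImplementedError

    def _load_client(self):
        self._load_deps()
        if getattr(self, "generator", None) is not None:
            return
        self.generator = self._build_generator()


class DemoPipeline(ClientGuardMixin):
    def __init__(self):
        self.builds = 0

    def _build_generator(self):
        self.builds += 1
        return object()
```

Calling `_load_client()` twice on a `DemoPipeline` builds the client only once.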

erickgalinkin · 2025-05-28T14:42:30Z

garak/generators/langchain.py

@@ -53,14 +50,7 @@ def __init__(self, name="", config_root=_config):

        super().__init__(self.name, config_root=config_root)



Are we missing a call to self._load_deps()?

erickgalinkin · 2025-05-28T14:43:29Z

garak/generators/litellm.py

@@ -105,6 +96,7 @@ class LiteLLMGenerator(Generator):
        "skip_seq_start",
        "skip_seq_end",
        "stop",
+        "verbose",
    )

    def __init__(self, name: str = "", generations: int = 10, config_root=_config):


Are we missing a call to self._load_deps()?

erickgalinkin · 2025-05-28T14:45:51Z

garak/generators/nemo.py

@@ -37,6 +36,7 @@ class NeMoGenerator(Generator):

    supports_multiple_generations = False
    generator_family_name = "NeMo"
+    extra_dependency_names = ["nemollm"]

    def __init__(self, name=None, config_root=_config):


Are we missing a call to self._load_deps()?

I keep asking this but I may be missing some logic that's inherited, so my question is very sincere.
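On the inherited-logic question: the PR description says the pattern is used in the generator constructor, so, assuming the base `__init__` calls `_load_deps()`, a subclass that calls `super().__init__()` would not need its own call. The sketch below illustrates that assumption with hypothetical stand-in classes; it is not the code in garak/generators/base.py.

```python
import importlib


class BaseGeneratorSketch:
    # hypothetical stand-in for the base Generator
    extra_dependency_names = []

    def __init__(self, name=""):
        self.name = name
        self._load_deps()  # runs for every subclass that calls super().__init__()

    def _load_deps(self):
        for dep in self.extra_dependency_names:
            attr = dep.replace(".", "_").replace("-", "_")
            if getattr(self, attr, None) is None:
                setattr(self, attr, importlib.import_module(dep))


class DemoGenerator(BaseGeneratorSketch):
    # 'json' is a stdlib stand-in for an optional package such as "nemollm"
    extra_dependency_names = ["json"]

    def __init__(self, name=""):
        super().__init__(name)  # no explicit self._load_deps() needed here


g = DemoGenerator("demo")
```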

erickgalinkin · 2025-05-28T14:46:44Z

garak/generators/octo.py


    def __init__(self, name="", config_root=_config):
-        from octoai.client import Client

        self.name = name
        self._load_config(config_root)
        self.fullname = f"{self.generator_family_name} {self.name}"

        super().__init__(self.name, config_root=config_root)


Are we missing a call to self._load_deps()?

erickgalinkin · 2025-05-28T14:48:08Z

garak/generators/ollama.py

@@ -28,17 +32,18 @@ class OllamaGenerator(Generator):
    active = True
    generator_family_name = "Ollama"
    parallel_capable = False
+    extra_dependency_names = ["ollama"]

    def __init__(self, name="", config_root=_config):


Are we missing a call to self._load_deps()?

I'm going to stop asking every time and assume I've missed some inherited logic.

erickgalinkin · 2025-05-28T14:50:14Z

garak/generators/openai.py

@@ -158,6 +158,7 @@ def __setstate__(self, d) -> object:
    def _load_client(self):
        # When extending `OpenAICompatible` this method is a likely location for target application specific
        # customization and must populate self.generator with an openai api compliant object
+        self._load_deps()


We don't seem to have added the openai dependency to extra_dependency_names. Is that deliberate?

leondz and others added 15 commits May 2, 2025 14:11

draft postponed import pattern for cohere generator

3000d4c

move extra dependency requirements into classdefs, mediate requirement checks via plugin cache

757e0f3

actually do the plugin dep load

9310d0a

migrate generators to 'extra dependencies' pattern

dac569e

prune dupe lazyload

35e93fc

extra_dependency_names in all plugins

bf7f36b

active must be False for Probes using extra modules

6a39b0c

make PIL optional in generators.huggingface.LLaVA

56c6182

move optional load fail to ModuleNotFoundError

3657e04

add _load/_clear_deps() into base generator and _load/_clear client

865d604

put the MNFE where it belongs

d61957d

backoff exception placeholder must inherit base exception

8a7051e

test for reqs presence in pyproject.toml, requirements.txt

60775f6

handle hyphen in pypi pkg names

31e98d4

rm optional plugin deps

75babb7

leondz requested a review from jmartin-tech May 7, 2025 11:00

leondz added the architecture Architectural upgrades label May 7, 2025

leondz requested review from jmartin-tech and removed request for jmartin-tech May 7, 2025 11:04

leondz added this to the 0.11.0 milestone May 8, 2025

leondz added 5 commits May 8, 2025 14:19

skip generator tests if optional deps absent

83f551a

support sub-package deps

dd51196

scope optimum to nvidia

b33a46c

move import function to _load_deps

de5b3f1

rm import handling in langchain

19c31fe

jmartin-tech reviewed May 8, 2025

garak/generators/nemo.py (outdated)

leondz self-assigned this May 8, 2025

leondz removed this from the release 0.11.0 milestone May 8, 2025

leondz added 2 commits May 8, 2025 17:39

amend optimum to be nvidia flavour

54fabc5

dry - use garak._plugins.PLUGIN_TYPES as canonical def of 1st class plugins

ffac714

leondz added 6 commits May 9, 2025 10:28

unify backoff exception pattern mediated via garak GeneratorBackoffException

97c8160

skip instantiation when modules not present

1d4e69c

catch straggling backoff exception wrappings

6164bc5

Merge branch 'main' into update/optional_imports

85fb7c3

use isinstance for exception matching

0402116

don't backoff on 404

e287fe9

jmartin-tech reviewed May 12, 2025


leondz mentioned this pull request May 15, 2025

Add audio NIM model and audio probes #1163

Open

leondz added 3 commits May 16, 2025 12:49

merge in our good pal main

6339648

switch to pyproject; get tests deps if testing

76b1774

add [dev] target

ca133e4

leondz requested a review from erickgalinkin as a code owner May 16, 2025 11:01

leondz added 7 commits May 16, 2025 13:04

add required jsonschema that was previously implicit from now-optional imports

8e8a5b9

specify versions; move to secure versions cf. NVIDIA#1207

aa7500a

skip internal config mappings for req consistency testing

4f2e5ef

skip test option for non-test workflow

69cfef2

skip ollama tests if no module

a1da5ed

rm spurious dep check

3a8605d

straggling spurious check

d2d17ad

leondz marked this pull request as draft May 16, 2025 12:14

jmartin-tech reviewed May 16, 2025


Merge branch 'main' into update/optional_imports

13974b8

erickgalinkin reviewed May 28, 2025
